FlashMatrix: Parallel, Scalable Data Analysis with Generalized Matrix Operations using Commodity SSDs
Authors
Abstract
FlashMatrix is a matrix-oriented programming framework for general data analysis with a high-level functional programming interface. It scales matrix operations beyond memory capacity by utilizing solid-state drives (SSDs) on non-uniform memory architecture (NUMA) machines. It provides a small number of generalized matrix operations (GenOps) and reimplements a large number of matrix operations in the R framework with GenOps. As a result, it executes R code in parallel and out of core automatically. FlashMatrix uses vectorized user-defined functions (VUDFs) to reduce the overhead of function calls, and it fuses matrix operations to reduce data movement between the CPU and SSDs. We implement multiple machine learning algorithms in R to benchmark the performance of FlashMatrix. On a large parallel machine, out-of-core execution of these R implementations in FlashMatrix performs comparably to in-memory execution on a billion-scale dataset, while significantly outperforming the in-memory execution of Spark MLlib.
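The two ideas named in the abstract, vectorized user-defined functions and fused matrix operations, can be illustrated with a minimal sketch. This is not the FlashMatrix API: `read_blocks`, `square_vudf`, and `fused_map_col_sum` are hypothetical names, plain Python lists stand in for matrices, and slicing an in-memory list stands in for streaming row blocks from SSD. The sketch shows only the general pattern of calling a UDF once per row rather than once per element, and of folding an elementwise map and a column aggregation into a single pass over the data.

```python
from typing import Callable, Iterable, Iterator, List

Block = List[List[float]]  # a horizontal slice of a tall matrix (rows x cols)


def read_blocks(matrix: Block, rows_per_block: int) -> Iterator[Block]:
    """Hypothetical stand-in for streaming row blocks from SSD."""
    for start in range(0, len(matrix), rows_per_block):
        yield matrix[start:start + rows_per_block]


def square_vudf(values: List[float]) -> List[float]:
    """Vectorized UDF: one call processes a whole row, not one element."""
    return [v * v for v in values]


def fused_map_col_sum(blocks: Iterable[Block],
                      vudf: Callable[[List[float]], List[float]],
                      ncol: int) -> List[float]:
    """Apply `vudf` elementwise and sum each column in a single pass.

    The mapped matrix is never materialized: each row is transformed and
    folded into the running column sums before the next block is read.
    """
    sums = [0.0] * ncol
    for block in blocks:
        for row in block:
            mapped = vudf(row)
            for j, value in enumerate(mapped):
                sums[j] += value
    return sums


data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
result = fused_map_col_sum(read_blocks(data, rows_per_block=2),
                           square_vudf, ncol=2)
print(result)
```

An unfused version would first build the full squared matrix and then read it again to sum its columns. When the matrix lives on SSD, that intermediate would have to be written out and read back, which is the data movement the abstract says fusion avoids.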
Similar resources
Using GPUs to Forensically Recover Non-Addressable Data on Hardware-FTL-Backed NAND Storage
We present work-in-progress towards a general, parallelized method to identify leftover data in nonaddressable regions of NAND flash storage devices for which the storage interface does not allow access to all the data regions (such as with thumb drives and SSDs). Significant amounts of data can wind up in non-addressable regions through regular use of the NAND flash storage that cannot be acce...
Optimizing Database Operators by Exploiting Internal Parallelism of Solid State Drives
With the development of flash memory technology, flash-based solid state drives (SSDs) are gradually used in more and more devices and applications. In addition to characteristics of flash memory itself, a unique characteristic of SSDs, namely internal parallelism, should also be considered to improve performance of SSDs-based DBMSs, especially query processing. In this paper, we first describe...
FMMU: A Hardware-Automated Flash Map Management Unit for Scalable Performance of NAND Flash-Based SSDs
NAND flash-based Solid State Drives (SSDs), which are widely used from embedded systems to enterprise servers, are enhancing performance by exploiting the parallelism of NAND flash memories. To cope with the performance improvement of SSDs, storage systems have rapidly adopted the host interface for SSDs from Serial-ATA, which is used for existing hard disk drives, to high-speed PCI express. Si...
Fast Processing of Large Graph Applications Using Asynchronous Architecture
Graph algorithms and techniques are increasingly being used in scientific and commercial applications to express relations and explore large data sets. Although conventional or commodity computer architectures, like CPU or GPU, can compute fairly well dense graph algorithms, they are often inadequate in processing large sparse graph applications. Memory access patterns, memory bandwidth require...
Design and Implementation of a Scalable Parallel System for Multidimensional Analysis and OLAP
Multidimensional Analysis and On-Line Analytical Processing (OLAP) uses summary information that requires aggregate operations along one or more dimensions of numerical data values. Query processing for these applications require different views of data for decision support. The Data Cube operator provides multi-dimensional aggregates, used to calculate and store summary information on a number...
Journal title:
- CoRR
Volume: abs/1604.06414
Publication date: 2016